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O ■ Abstract 

- ^ , In this paper, we model dependence between operational risks by allowing risk 

^ • profiles to evolve stochastically in time and to be dependent. This allows for a 



flexible correlation structure where the dependence between frequencies of different 
risk categories and between severities of different risk categories as well as within 
risk categories can be modeled. The model is estimated using Bayesian inference 
methodology, allowing for combination of internal data, external data and expert 
opinion in the estimation procedure. We use a specialized Markov chain Monte 
Carlo simulation methodology known as Slice sampling to obtain samples from the 
resulting posterior distribution and estimate the model parameters. 

Keywords: dependence modelling, copula, compound process, operational risk, 
Bayesian inference, Markov chain Monte Carlo, Slice sampling. 
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1 Introduction 



Modelling dependence between different rislc cells and factors is an important challenge in 
operational risk (OpRisk) management. The difficulties of correlation modelling are well 
known and, hence, regulators typically take a conservative approach when considering 
correlation in risk models. For example, the Basel II OpRisk regulatory requirements 
for the Advanced Measurement Approach, BIS (2006) p. 152, states ^^Risk measures for 
different operational risk estimates must be added for purposes of calculating the regula- 
tory minimum capital requirement. However, the bank may be permitted to use internally 
determined correlations in operational risk losses across individual operational risk es- 
timates, provided it can demonstrate to the satisfaction of the national supervisor that 
its systems for determining correlations are sound, implemented with integrity, and take 
into account the uncertainty surrounding any such correlation estimates (particularly in 
periods of stress). The bank must validate its correlation assumptions using appropriate 
quantitative and qualitative techniques." 

The current risk measure specified by regulatory authorities is Value-at-Risk (VaR) 
at the 0.999 level for a one year holding period. In this case simple summation over 
VaRs corresponds to an assumption of perfect dependence between risks. This can be 
very conservative as it ignores any diversification effects. If the latter are allowed in 
the model, capital reduction can be significant providing a strong incentive to model 
dependence in the banking industry. At the same time, limited data does not allow for 
reliable estimates of correlations and there are attempts to estimate these using expert 
opinions. In such a setting a transparent dependence model is very important from the 
perspective of model interpretation, understanding of model sensitivity and with the aim 
of minimizing possible model risk. However, we would also like to mention that VaR is 
not a coherent risk measure, see Artzner, Delbaen, Eber and Heath (1999). This means 
that in principal dependence modelling could also increase VaR, see Embrechts, Neslehova 
and Wiithrich (2009) and Embrechts, Lambrigger and Wiithrich (2009). 

Under Basel II requirements, the financial institution intending to use the Advanced 
Measurement Approach (AMA) for quantification of OpRisk should demonstrate accuracy 
of the internal model within 56 risk cells (eight business lines times seven event types). To 
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meet regulatory requirements, the model should make use of internal data, relevant ex- 
ternal data, scenario analysis and factors reflecting the business environment and internal 
control systems. The definition of OpRisk, Basel II requirements and the possible Loss 
Distribution Approach for AMA were discussed widely in the literature, see e.g. Cruz 
(2004), Chavez-Demoulin, Embrechts and Neslehova (2006), Frachot, Moudoulaud and 
Roncalli (2004), Shevchenko (2009). It is more or less widely accepted that under the 
Loss Distribution Approach of AMA Basel II requirements, the banks should quantify 
distributions for frequency and severity of OpRisk for each business line and event type 
over a one year time horizon. These are combined into an annual loss distribution for the 
bank top level (as well as business lines and event types if required) and the bank capital 
(unexpected loss) is estimated using the 0.999 quantile of the annual loss distribution. If 
the severity and frequency distribution parameters are known, then the capital estimation 
can be accomplished using different techniques. In the case of single risks there are: hybrid 
Monte Carlo approaches, see Peters, Johansen and Doucet (2007); Panjer Recursions, see 
Panjer (1981); integration of the characteristic functions, see Luo and Shevchenko (2009); 
Fast Fourier Transform techniques, see e.g. Embrechts and Frei (2009), Temnov and 
Warnung (2008). To account for parameter uncertainty, see Shevchenko (2008), and in 
multivariate settings Monte Carlo methods are typically used. 

The commonly used model for an annual loss in a risk cell (business line/event type) 
is a compound random variable. 



Here t = 1,2,...,T, T + 1 in our framework is discrete time (in annual units) with T + 1 
corresponding to the next year. The upper script j is used to identify the risk cell. The 
annual number of events N^-^'^ is a random variable distributed according to a frequency 
counting distribution P^^\-\x[''^), typically Poisson, which also depends on time dependent 
parameter (s) A^''^ The severities in year t are represented by random variables Xs''\t), 
s > 1, distributed according to a severity distribution F(-^^(-|?/^p''), typically lognormal, 
WeibuU or generalized Pareto distributions with parameter (s) ipl''^ . Note, the index j on 
the distributions P^^^ and F^^^ reflects that distribution type can be different for different 
risks, for simplicity of notation we shall omit this j, using P(-|AS^'^) andF(-|^p'^), hereafter. 




■U) 



(1.1) 



s=l 
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The variables and V'j generically represent distribution (model) parameters of the 
jth j,jgj^ tliai we refer hereafter to as the risk profiles. Typically, it is assumed that given 
Ap^ and ipt''\ the frequency and severities of the j^'^ risk are independent and the severities 
within the j*^ risk are independent too. The total bank's loss in year t is calculated as 



where formally for OpRisk under the Basel II requirements J = 56 (seven event types 
times eight business lines). However, this may differ depending on the financial institution 
and type of problem. 

Conceptually under model (11. ip . the dependence between the annual losses Z^^'^ and 
Zf^\ i 7^ j, can be introduced in several ways. For example via: 

• Modelling dependence between frequencies A^^"''' and A^^*^ directly through e.g. cop- 
ula methods, see e.g. Frachot, Roncalli and Salomon (2004), Bee (2005) and Aue 
and Klakbrener (2006) or common shocks, see e.g. Lindskog and McNeil (2003), 
Powojowski, Reynolds and Tuenter (2002). We note that the use of copula methods, 
in the case of discrete random variables, needs to be done with care. The approach 
of common shocks is proposed as a method to model events affecting many cells at 
the same time. Formally, this leads to dependence between frequencies of the risks if 
superimposed with cell internal events. One can introduce the dependence between 
event times of different risks, e.g. the event time of the j'*'* risk correlated to 
the 1'^* event time of the i*'^ risk, etc., but it can be problematic to interpret such a 
model. 

• Considering dependence between severities (e.g. the first loss amount of the j*^ 
risk is correlated to the first loss of the i^'^ risk, second loss in the j^^ risk is corre- 
lated to second loss in the i*'' risk, etc), see e.g. Chavez-Demoulin, Embrechts and 
Neslehova (2006). This can be difficult to interpret especially when one considers 
high frequency versus low frequency risks. 

• Modelling dependence between annual losses directly via copula methods, see Gia- 
cometti, Rachev, Chernobai and Bertocchi (2008), Bocker and Kliippelberg (2008) 



J 




(1.2) 
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and Embrechts and Puccetti (2008). However, this may create irreconcilable prob- 
lems with modelling insurance for OpRisk that directly involves event times. Addi- 
tionally, it will be problematic to quantify these correlations using historical data, 
and the LDA model (11.11) will loose its structure. Though one can consider depen- 
dence between losses aggregated over shorter periods. 

In this paper, we assume that all risk profiles are stochastically evolving in time. That 
is we model risk profiles Xt = ^Aj^\ . . . , Aj"^'' j and Vt = (^i^t^\ ■ ■ ■ : i^t"^^^ by random 
variables At = (^A[^\ . . . , Aj'^^'j and = (^^^^ • • • , "^t^^^ , respectively. We introduce 
dependence between risks by allowing dependence between their risk profiles At and 
Note that, independence between frequencies and severities in (11.11) is conditional on risk 
profiles {At, ^t) only. Additionally we assume, for the sake of simplicity, that all risks are 
independent conditional on risk profiles. 

Stochastic modelling of risk profiles may appeal to intuition. For example consider 
the annual number of events for the j*'* risk modelled as random variables from Poisson 
distribution Poi (^A[^''^ = X['^^^. Conditional on Aj"''', the expected number of events per 
year is A^''^ The latter is not only different for different banks and different risks but also 

(i) 

changes from year to year for a risk in the same bank. In general, the evolution of A^ , 
can be modelled as having deterministic (trend, seasonality) and stochastic components. 
In actuarial mathematics this is called a mixed Poisson model. For simplicity, in this 
paper, we assume that A|"'^ is purely stochastic and distributed according to a Gamma 
distribution. 

Now consider a sequence (Ai, ^i) , . . . , (A-r+i, ^y+i). It is naive to assume that risk 
profiles of all risks are independent. Intuitively these are dependent, for example, due 
to changes in politics, regulations, law, economy, technology (sometimes called drivers or 
external risk factors) that jointly impact on many risk cells at each time instant. In this 
paper we focus on dependence between risk profiles. 

We begin by presenting the general model and then we perform analysis of relevant 
properties of this model in a bivariate risk setting. Next, we demonstrate how to perform 
inference under our model by adopting a Bayesian approach that allows one to combine 
internal data with expert opinions and external data. We consider both the single risk 
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and multiple risk settings for the example of modelling claims frequencies. Then we 
present an advanced simulation procedure utilizing a class of Markov chain Monte Carlo 
(MCMC) algorithms which allow us to sample from the posterior distributions developed. 
Finally, we demonstrate the performance of both the model and the simulation procedure 
in several examples, before finishing with a discussion and conclusions. 

The main objective of the paper is to present the framework we develop for the mul- 
tivariate problem and to demonstrate estimation in this setting. Application of real data 
is the subject of further research. To clarify notation, we shall use upper case symbols to 
represent random variables, lower case symbols for their realizations and bold for vectors. 



2 Model 

Model Assumptions 2.1 Consider J risks each with a general model ( f i. il) for the an- 
nual loss in year t, Z\^\ and each modelled by severity Xs''\t) and frequency n[^\ The 
frequency and severity risk profiles are modelled by random vectors A^ = (A^"^'', . . . , A^"''') 
and = . . . , respectively and parameterized by risk characteristics 0\ = 

{0^j^\ . . . , 0^/^) and 6^ = {9^\ . . . , 6*^'*) correspondingly. Additionally, the dependence 
between risk profiles is parameterized by Op. Assume that, given = {0\, 0^, Op): 

1. The random vectors, 

(*i,Ai,A^P\xi^)(l);j = l,...,J,s>l) 

(*T+i, At+1, A^i^li, Xi^) (T + 1) ; J = 1, . . . , J, s > l) 

are independent. That is, given 0, between different years, the risk profiles for 
frequencies and severities as well as the number of losses and actual losses are 
independent. 

2. The vectors (^'i, Ai) , . . . , (^^t+i, Ay+i) are i.i.d. from a joint distribution with 
marginal distributions Ap-* ~ G (y\0'^l^^ , ^\^^ ~ H ^■|^^''^ and 2 J -dimensional cop- 
ula C{-\Op). 
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3. Given At = At and ^'t = ij^t- the compound random variables z[^\ . . . , z['^^ are 



independent with and Xj; (t) , X2 (t) , . . . independent; frequencies 



P and seventies xi^^ (t) F (^■\ipi^'^^ , s > 1. 



Calibration of the above model requires estimation of 0. A thorough discussion about 
the interpretation and role of is provided in Section 4, where it will be treated within a 
Bayesian framework as a random variable to incorporate expert opinions and external 
data into the estimation procedure. Also note that for simplicity of notation, we assumed 
one severity risk profile and one frequency risk profile A^'^'' per risk - extension is 
trivial if more risk profiles are required to model risk. 

Copula models. To define the above model, a copula function C(-) should be speci- 
fied to model dependence between the risk profiles. For a description of copulas in the 
context of financial risk modelling see McNeil, Frey and Embrechts (2005). In general, a 
copula is a d-dimensional multivariate distribution on [0, 1]"' with uniform marginal dis- 
tributions. Given a copula function C{ui, . . . , Ud), the joint distribution of rvs Yi, . . . ,Yd 
with marginal distributions . . . , Fdiyd) can be constructed as 

F(yi,...,yrf) = C(Fi(yi),...,Frf(2/,)). (2.1) 

A well known theorem due to Sklar, published in 1959, says that one can always find a 
unique copula C{ ) for a joint distribution with given continuous marginals. Note that in 
the case of discrete distributions this copula may not be unique. Given (12. ip . the joint 
density can be written as 

d 

fiVu ...,yd) = c(Fi(yi), . ■ . , Fdivd)) n My^)- (2-2) 

i=l 

where c(-) is a copula density and /i(?/i), . . . , fdijjd) are marginal densities. In this paper, 
for illustration purposes we consider the Gaussian, Clayton and Gumbel copulas (Clayton 
and Gumbel copulas belong to the so-called family of the Archimedean copulas): 



Gaussian copula: 



/^(F/(ni),...,F/(n,)) 



c(ui, . . . ,Ud|S) = , (2.3) 



where -F/v(-) and /7v(-) are the standard Normal distribution and its density respec- 
tively and /^(■) is a multivariate Normal density with zero means, unit variances 
and correlation matrix S. 



Clayton copula: 

c{ui,...,Ud\p) = ( + 

i=l 



p a 



Ua 



n((n.)-''-M(z-l)p+l}), (2.4) 



where p > is a dependence parameter. 
Gumbel copula: 

c(mi, . . . ,Ud\p) 



Qd 



dui . . . dud 



C (mi, . . .,Ud\p) 



C{ui,...,Ud\p) = exp|- (^^.^^(-log(Mi))^j'' 
where p > 1 is a dependence parameter. 

In the bivariate case the explicit expression for Gumbel copula is given by 



(2.5) 
(2.6) 



c{ui,U2\p) 



l + (p-l) login,))' 



24-1 



[log (mi) log (wa)] 



p-i 



An important difference between these three copulas is that they each display different 
tail dependence properties. The Gaussian copula has no upper or lower tail dependence, 
the Clayton copula produces lower tail dependence, whereas the Gumbel copula produces 
upper tail dependence, see McNeil, Frey and Embrechts (2005). 



Common factor models. The use of common (systematic) factors is useful to identify 
dependent risks and to reduce the number of required correlation coefficients that must 
be estimated. For example, assuming a Gaussian copula between risk profiles, consider 
one common factor Qt affecting all risk profiles as follows 



PiQt + 



(0 , 



piw, 



1, 



2 J; 



A 



(i) 



G-\F^{Y}'^)\e'i^),^['^ = H-\Fr,{Yl^^'^)\e'4^),j = 1, . . . , J, (2.7) 



r(i) 



where Wl'\...,Wr> and fi* are iid from the standard Normal distribution and all rvs 
are independent between different time steps t. Given Qt, all risk profiles are independent 
but unconditionally the risk profiles are dependent if the corresponding pi are nonzero. 
In this example, one should identify 2J correlation parameters pi only instead of J( J — 
l)/2 parameters of the full correlation matrix. Often, common factors are unobservable 
and practitioners use generic intuitive definitions such as: changes in political, legal and 
regulatory environments, economy, technology, system security, system automation, etc. 
Several external and internal factors are typically considered. The factors may affect the 
frequency risk profiles (e.g. system automation), the severity risk profiles (e.g. changes in 
legal environment) or both the frequency and severity risk profiles (e.g. system security). 
For more details on the use and identification of the factor models, see Section 3.4 in 
McNeil, Frey and Embrechts (2005); also, see Sections 5.3 and 7.4 in Marshall (2001) for 
the use in the operational risk context. 

In general, a copula can be introduced between all risk profiles. Though, for simplicity, 
in the simulation examples below, presented for two risks, we consider dependence between 
severities and frequencies separately. Also, in this paper, the estimation procedure is 
presented for frequencies only. The actual procedure can be extended in the same manner 
as presented to severities but it is the subject of further work. 

3 Simulation Study - Bivariate Case 

We start with a bivariate model, where we study the strength of dependence at the 
annual loss level obtained through dependence in risk profiles, as discussed above. We 
consider two scenarios. The first involves independent severity risk profiles and depen- 
dent frequency risk profiles. The second involves dependence between the severity risk 
profiles and independence between the frequency profiles. In both scenarios, we consider 
three bivariate copulas (Gaussian, Clayton and Gumbel copulas fl2.3p - fl2.6l) ) denoted as 
C{ui,U2\p) and parameterized by one parameter p which controls the degree of depen- 
dence. In the case of Gaussian copula, p is a non-diagonal element of correlation matrix 
S in (I23|) . 
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Bivariate model for risk profiles. We assume that Model Assumptions 2.1 are fullfilled 

for the aggregated losses 

zi'' = j:^i'Ht) and Z?^ = J2xPit). 

s=l s=l 

As marginals, for j = 1, 2 we choose: 

• N^'^ ~ Pot (^X?^ and xi'\t) LN s > 1. 
. A? ~ r ), ~ iV (/i?,4^)). 

Here, r(a,/5) is a Gamma distribution with mean a/ (3 and variance a/ (3"^, N{fi,a) is a 
Gaussian distribution with mean /x and standard deviation a, and LN [n, a) is a lognormal 
distribution. 

In analyzing the induced dependence between annual losses, we consider two scenarios: 

• Scenario 1: A^^^* and A^-* are dependent via copula C{ui,U2\p) while \E'[^^ and \E^P^ 
are independent. 

• Scenario 2: "^[^^ and \E'j^'* are dependent via copula C{ui,U2\p) while A^^^^ and A[^^ 
are independent. 

Here, parameter p corresponds to 6p in Model Assumptions 2.1. The simulation of the 
annual losses when risk profiles are dependent via a copula can be accomplished as shown 
in Appendix A. Utilizing this procedure, we examine the strength of dependence between 
the annual losses if there is a dependence between the risk profiles. In the next sections 
we will demonstrate the Bayesian inference model and associated methodology to perform 
estimation of the model parameters. Here, we assume the parameters are known a priori 
with the following values used in our specific example: 

• a^i^ = 5,pil^ = 0.1, p^i^ = 2,ujf = 0.4, a(^) = l;j = 1,2 

These parameters correspond to and 0^, in Model Assumptions 2.1. In Figured], we 
present three cases where C (-Ip) is a Gaussian, Clayton or Gumbel copula under both 
scenario 1 and scenario 2. In each of these examples we vary the parameter of the copula 
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model p from weak to strong dependence. The annual losses are not Gaussian distributed 
and to measure the dependence between the annual losses we use a non-linear rank corre- 
lation measure, Spearman's rank correlation, denoted by psr{zI^\ zf''*). The Spearman's 
rank correlation between the annual losses was estimated using 10, 000 simulated years for 
each value of p. In these and other numerical experiments we conducted, the range of pos- 
sible dependence between the annual losses of different risks induced by the dependence 
between risk profiles is very wide and should be flexible enough to model dependence 
in practice. Note, the degree of induced correlation can be further extended by working 
with more flexible copula models at the expense of estimation of a larger number of model 
parameters. 



4 Bayesian Inference: combining different data sources 

In this section we estimate the model introduced in Section 2 using a Bayesian inference 
method. To achieve this we must consider that the requirements of Basel II AMA (see 
BIS, p. 152) clearly state that: "Any operational risk measurement system must have 
certain key features to meet the supervisory soundness standard set out in this section. 
These elements must include the use of internal data, relevant external data, scenario 
analysis and factors reflecting the business environment and internal control systems" . 
Hence, Basel II requires that OpRisk models include use of several different sources of 
information. We will demonstrate that to satisfy such requirements it is important that 
methodology such as the one we develop in this paper be considered in practice to ensure 
one can soundly combine these different data sources. 

It is widely recognized that estimation of OpRisk frequency and severity distributions 
cannot be done solely using historical data. The reason is the limited ability to predict 
future losses in a banking environment which is constantly changing. Assume that a new 
policy was introduced in a financial institution with the intention of reducing an OpRisk 
loss. This cannot be captured in a model based solely on historical loss data. 

For the above reasons, it is very important to include Scenario Analysis (SA) in OpRisk 
modelling. SA is a process undertaken by banks to identify risks; analyze past events ex- 
perienced internally and jointly with other financial institutions including near miss losses; 
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consider current and planned controls in the banks, etc. Usually, it involves surveying 
of experts through workshops. A template questionnaire is developed to identify weak- 
nesses, strengths and other factors. As a result an imprecise, value driven quantitative 
assessment of risk frequency and severity distributions is obtained. On its own, SA is very 
subjective and we argue it should be combined (supported) by actual loss data analysis. 
It is not unusual that correlations between risks are attempted to be specified by experts 
in the financial institution, typically via SA. 

External loss data is also an important source of information which should be incorpo- 
rated into modelling. There are several sources available to obtain external loss data, for 
a discussion on some of the data related issues associated with external data see Peters 
and Teruads (2007). 

Additionally, the combination of expert opinions with internal and external data is a 
difficult problem and complicated ad-hoc procedures are used in practice. Some prominent 
risk professionals in industry have argued that statistically consistent combining of these 
different data sources is one of the most pertinent and challenging aspects of OpRisk 
modelling. It was quoted in Davis (2006) "Another big challenge for us is how to mix the 
internal data with external data; this is something that is still a big problem because I don't 
think anybody has a solution for that at the moment" and "What can we do when we don't 
have enough data [■ ■ ■ ] How do I use a small amount of data when I can have external 
data with scenario generation? /■■■// think it is one of the big challenges for operational 
risk managers at the moment. ". Using the methodology that we develop in this paper, 
one may combine these data sources in a statistically sound approach, addressing these 
important practical questions that practitioners are facing under Basel II AMA. 

Bayesian inference methodology is well suited to combine different data sources in 
OpRisk, for example see Shevchenko and Wiithrich (2006). A closely related credibility 
theory toy example was considered in Biihlmann, Shevchenko and Wiithrich (2007). We 
also note that in general questions of Bayesian model choice must be addressed, adding 
to this there is the additional complexity that estimation of the required posterior distri- 
butions will typically require MCMC, see Peters and Sisson (2006). 

A Bayesian model to combine three data sources (internal data, external data and 
expert opinion) for the case of a single risk cell was presented in Lambrigger, Shevchenko 
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and Wiithrich (2007). In this paper we extend this approach to the case of many risk cells 
with the dependence between risks introduced as in Section 2. Hereafter, for illustrative 
purposes we restrict to modelling frequencies only. 

Hence, our objective will be to utilise Bayesian inference to estimate the parameters 
of the model through the combination of expert opinions and observed loss data (internal 
and external). 

We note that as part of this Bayesian model formualtion an information flow can be 
incorporated into the model. This could be introduced in many forms. The most obvious 
example involves incorporation of new data from actual observed losses. However, we 
stress that more general ideas are possible. For example, if new information becomes 
available (new policy introduced, etc) then experts can update their prior distributions 
to incorporate this information into the model. 

Additionally, under a Bayesian model we note that SA could naturally form part of a 
subjective Bayesian prior elicitation procedure, see O'Hagan (2006). 

4.1 Modelling frequencies for a single risk cell 

Here we follow the Lambrigger, Shevchenko and Wiithrich (2007) approach to combine 
different data sources for one risk cell in the case of the Model Assumptions 12. 1[ 

Define a model in which every risk cell of a financial company j G {!,..., J} is 
characterized by a risk characteristic 6^^ that describes the frequency risk profile Ap'' 
in risk cell j. This 6^^ represents a vector of unknown distribution parameters of risk 
profile A[''\ The true value of 6^^ is not known and modelled as a random variable. A 
priori, before having any company specific information, the prior distribution of 6^^"* is 
based on external data only. Our aim then is to specify the distribution of Q^f' when we 
have company specific information about risk cell j such as observed losses and expert 
opinions. This is achieved by developing a Bayesian model and numerical estimation of 
relevant quantities is performed via MCMC methods. For simplicity, in this section, we 
drop the risk cell specific superscript j since we concentrate on modelling frequencies for 
single risk cell j, where Q^/^ is a scalar 6a and all other parameters are assumed known. 

Model Assumptions 4.1 Assume that risk cell j has a fixed, deterministic volume V 
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(i.e. number of transactions, etc.). 



1. The risk characteristics 0a of risk cell j has prior distribution: 0a ~ r(a,6) for 
given parameters a > and b > 0. 

2. Given 0a = 6'a, (Ai, A''i), . . . , (A^^+i, Nt+i) are i.i.d. and the intensity of events of 
year t E {1,...,T + 1} has conditional marginal distribution At ~ r(a, a/6\) for a 
given parameter a > 0. 

3. Given = ^a (I'nd At = Xt, the frequencies Nt ~ Poi{VXt). 

4. The financial company has K expert opinions A^., k = 1,. . . ,K about 0a. Given 
0A = ^Aj and {At,Nt) are independent for all k and t, and Ai,...,Aft: are 
i.i.d. with Afc ~ r(^,^/^A)- 

Remarks 4.2 

• In items 1) and 2) we choose a gamma distribution for the underlying parameters. 
Often, the available data is not sufficient to support such a choice. In such cases, in 
actuarial practice, one often chooses a gamma distribution. A gamma distribution 
is neither conservative nor aggressive and it has the advantage that it allows for 
transparent model interpretations. If other distributions are more appropriate then, 
of course, one should replace the gamma assumption. This can easily be done in 
our simulation methodology. 

• Given that 0a ~ r(a, 6), -E [0a] = ci/b and yar (0a) = These are the prior 
two moments of the underlying risk characteristics 0a. The prior can be determined 
by external data (or regulator). In general, parameters a and b can be estimated by 
the maximum likelihood method using the data from all banks. 

• Note that we have for the first moments 



E[At\QK] 



0A, E[At] = -, E[Nt\Q^,At] = V K 
V 0A, E [Nt] = V^. 



E[Nt\eA] 
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The second moments are given by 

2 

Var{At\eA) = el, Var (At) = a"' ^ + (a^' + 1) ^, 
Var{N,\eA,At) = V A^, Var {Nt\ Q^) = V + a'' (4.1) 

2 

VariN,) = V^ + V'a-'^ + V'{a~' + l)^. 

For model interpretation purposes, consider the results for the coefficient of variation 
(CV), a convenient dimensionless measure of uncertainty commonly used in the 
insurance industry: 

lim CV^ (AT, I Ga) = lim ^.T.^^j'^f = a"^ > 0, (4.2) 



and 



lim CV^ (Nt) = lim !'!!\' = + + l) > 0. (4.3) 

V^oo V^oo [Nt] ^ ^ 

That is, our model makes perfect sense from a practical perspective. Namely, as 
volume increases, V ^ oo, there always remains a non- diver sifiable element, see 
(14.21) and (14.31) . This is exactly what has been observed in practice and what reg- 
ulators require from internal models. Note, if we model as constant and known 
then limy^oo CV^ {Nt\At) 0. 

Contrary to the developments in Lambrigger, Shevchenko and Wiithrich (2007), 
where the intensity was constant overtime, now A^ is a stochastic process. From 
a practical point of view, it is not plausible that the intensity of the annual counts is 
constant over time. In such a setting parameter risks completely vanish if we have 
infinitely many observed years or infinitely many expert opinions, respectively (see 
Theorem 3.6 (a) and (c) in Lambrigger, Shevchenko and Wiithrich (2007)). This is 
because At can then be perfectly forecasted. In the present model, parameter risks 
will also decrease with increasing information. As we gain information the posterior 
standard deviation of Ba will converge to 0. However, since At+i viewed from time 
T is always random, the posterior standard deviation for A^+i will be finite. 

Note that, conditionally given 0a = 0\, Nt has a negative binomial distribution 
with probability weights for n > 0, 
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That is, at this stage we could directly work with a negative binomial distribution. 
As we will see below, only in the marginal case can we work with (4.4). In the 
multidimensional model we require Aj. 

• Ak denotes the expert opinion of expert k which predicts the true risk characteristics 
0A of his company. We have 

E[Ak\eA] = E[A,\Qa]=E[N^/V\Qa] = Qa, 
Var (AklQA) = ei/e, CV{Ak\eA)=r'^'. (4.5) 

That is, the relative uncertainty CV in the expert opinion does not depend on the 
value of 0A. That means that C, can be given externally, e.g. by the regulator, who 
is able to give a lower bound to the uncertainty. Moreover, we see that the expert 
predicts the average frequency for his company. Alternatively, ^ can be estimated 
using method of moments as presented in Lambrigger, Shevchenko and Wiithrich 
(2007). 

Denote Ai^r = (Ai,...,At), A^i:t = {Ni,...,Nt) and Ai;^ = (Ai,...,Ax). Then 
the joint posterior density of the random vector (Oa,Ai.7-) given observations A'^i = 
77-1, ... , Nt = riT, Ai = . . . , Ak = 5k is by Bayes' Theorem 

Vr(6'A, \l:T\ni:T, 6i;k) oc tt {ui.tIOa, Xi;t) tt {Xi:t\Oa) tt {Si:k\Oa) tt (6'a) . (4.6) 

Here, the likelihood terms and the prior are made explicit, 

T 



7l{n^.,T\9A,\l■.T)n{\^..T\9A) = ]^ i^^M!!i^Z^Ar^ exp {-A* (V + a/^A)} , (4.7) 



ritl r(a) 

K 



7riS,..K\9A) = Yl^^^St' exp{-6,^/9A}, (4.8) 

7r(^A) = f^^r' exp{-^A&}. (4.9) 

Note that the intensities Ai, . . . , A^- are non-observable. Therefore we take the integral 
over their densities to obtain the posterior distribution of the random variable Ga given 
{Ni.,t,Ai.,k) 
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(4.10) 

Given Oa, the distribution of the number of losses Nt is negative binomiaL Hence, 
one could start with a negative binomial model for A^^. The reason for the introduction 
of the random intensities At is that we will utilize them to model dependence between 
different risk cells, by introducing dependence between A[^\ . . . , A^"^''. 

Typically, a closed form expression for the marginal posterior function of Ga, given 
{Ni-T, Ai-x) can not be obtained, except in this single risk cell setting. In general, we will 
integrate out the latent variables Ai, . . . , At numerically through a MCMC approach to 
obtain an empirical distribution for the posterior of tt (^a|'^1:T; <^1:A')- This empirical poste- 
rior distribution then allows for the simulation of A^+i and A^^^+i, respectively, conditional 
on the observations {Ni-_t, '^i-.k)- 

4.2 Modelling frequencies for multiple risk cells 

As in the previous section we will illustrate our methodology by presenting the frequency 
model construction. In this section we will extend the single risk cell frequency model 
to the general multiple risk cell setting. This will involve formulation of the multivariate 
posterior distribution. 

Model Assumptions 4.3 (multiple risk cell frequency model) Consider J risk cells 
Assume that every risk cell j has a fixed, deterministic volume V^^\ 

1. The risk characteristic ©a = (0a\ • • • , ©a^^) has a J-dimensional prior density 
tt^Oa). The copula parameters Op are modelled by a random vector @p with the 
prior density n (Bp); ©a d'nd ©p are independent. 
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2. Given ©a = and 0^ = Bp -. (Ai, . . . , (A^+i, A^t+i) '^'^e i.i.d. and the inten- 
sities A( = (A[^\ . . . , A^"''') have a J -dimensional conditional density with marginal 
distributions A^^ ~ G [■\0f^ = T (^a^^), a^^V^?) the copula c(-|0p). Thus the 
joint density of At is given by 

7r(A,|0A, Op) = c (G(Af Vi'^ • • • , G(AS^Vf )l^p) nvr(AFVi^'^ (4.11) 

where vr ^-I^a^^ denotes the marginal density. 

3. Given ©a = ^a ^''^d At = Xt, the number of claims are independent with 
N^^^ ~ Pot{V^^^XP),j = 1, . . . , J. 

4. There are expert opinions A.^. = (A^^\ . . . , A^.'^^), k = 1, . . . ,K. Given ©a = 0^: 
Afc and {At, Nt) are independent for all k and t; and A^"'^ are all independent with 
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Prior Structure tt (^a) and vr [Op) . In the following examples, a priori, the risk char- 
acteristics 0A^ are independent Gamma distributed: 0a'' ~ T{a^^\h'^^'^) with hyper- 
parameters a'--'^ > and 6*^-'^ > 0. This means that a priori the risk characteristics for the 
different risk classes are independent. That is, if the company has a bad risk profile in risk 
class j then the risk profile in risk class i need not necessarily also be bad. Dependence 
is then modelled through the dependence between the intensities A^^"*, . . . ,A[^\ If this 
is not appropriate then, of course, this can easily be changed by assuming dependence 
within ©A. 



18 



In the simulation experiments below we consider cases when the copula is parameter- 
ized by a scalar Op. Additionally, we are interested in obtaining inferences on Op implied 
by the data only so we use an uninformative constant prior on the ranges [-1,1], (0,30] 
and [1,30] in the case of Gaussian, Clayton and Gumbel copulas respectively. 
Posterior density. The marginal posterior density of random vector (©a, 6p) given 
data of counts Ni = rii, . . . , = rix and expert opinions Ai = 6i, . . . , = Sk is 



T 

71 {0A,Op\ni,T,di.,K) = Y[ / iO/^,0p,Xt\ni.,T,di.,K)dXt 

t=i ^ 

ocflif nexp {-1/(^)aF} ^^'f^"'' n{X,\e„Op) dX, 
t=i V j=i < • 

exp{-6i^^^<^^yof 



K J 

-nn 

k=i j=i 
J 



r(eo-)) 



.7 = 1 



5 Simulation Methodology - Slice sampler 



(4.12) 



Posterior fl4.12p involves integration and sampling from this distribution is difficult. Here 
we present a specialized MCMC simulation methodology known as a Slice sampler to 
sample from the desired target posterior distribution niO^, Op, Xi,x\ni,T, 6i,k)- Marginally 
taken samples of ©a and Op are samples from n {6a, Op\ni;T, Si.k) which can be used to 
make inference for required quantities. 

It will be convenient to define the exclusion operators, A^^'~''\ ^i:T\k and ©^""'^ For 
example: 
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Sampling from ir (0\, Op, \i,T\ni,T^ ^i-.k) or vr {O^, Op\ni,T, ^i-.k) via closed form inversion 
sampling or via rejection sampling is not typically an option. There are many reasons for 
this. Firstly, only for specific copula models will closed form tractable expression for the 
marginal vr (0a, 9p\ni-T, Si-k) be attainable, certainly not for the models we consider in this 
paper. Secondly, even for the expression of the joint posterior n {6\, 9p, Xi,T\ni,T, Si.k) 
it is typically only possible to sample from the conditional distributions sequentially via 
numerical inversion sampling techniques which is highly computational and inefficient in 
high dimensions. Additionally, we would like a technique which is independent of the 
potentially arbitrary choice in specifying a copula function for the dependence between 
Af'\ . . . , A["''*. Hence, we utilize an MCMC framework which we make general enough to 
work for any choice of copula model, developed next. 

5.1 Bayesian Parameter Estimation 

We separate the analysis into two parts. Firstly, we condition on knowledge of the copula 
parameters 9p = p, where p denotes the true copula parameters used to generate the data. 
This allows us to demonstrate that if the copula parameters 6p are known, we can perform 
estimation of other parameters accurately under joint inference. The second part involves 
joint estimation of 6p and 0^ to demonstrate the accuracy of the joint inference procedure 
developed. Note that, the model for this second part has not been formally introduced 
but is a simple extension of Model Assumptions 14.31 

5.1.1 Conditional on a priori knowledge of copula parameter 

Here we assume the copula parameter has been estimated a priori and so estimation only 
involves model parameters. Such a setting may arise for example if the copula parameter 
is already estimated via a ML estimator. The proposed sampling procedure we develop 
is a particular class of algorithms in the toolbox of MCMC methods. It is an alternative 
to a Gibbs sampler known as a univariate Slice sampler. We note that to implement 
a Gibbs sampler or a univariate Slice sampler one needs to know the form of the full 
conditional distributions. However, unlike the basic Gibbs sampler the Slice sampler does 
not require sampling from these full conditional distributions. Derivations of the posterior 
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full conditionals, 

7r(^i^>i"^'\ X,.,T, n,.,r, 6,.,k, 6,) cx 7r{X,.,T\e^~'\ of, e,)7r{S,.,K\0^~'\ 6^1^) 

X7r(<-^>J-V(^?), (5.1) 
7r(ApVA,AS:^'-^'Ui:T,5i:X,^p) oc n {n,..T\X^~^'~'\ X?)7r{Xi-'\ x['^ \0^, 6 ,) (5.2) 

are presented in Appendix B. 

5.1.2 Joint Inference of marginal and copula parameters 

To include the estimation of the copula parameter 6p jointly with the parameters ©a and 
latent intensities in our Bayesian framework, we assume that it is constant in time 
and model it by a random variable 0p with some prior density 7^(0 p). The full conditional 
posterior of the copula parameter, denoted vr {9p\6\, Xi-^t, ni,T, Si-^k) , is given by 

71 {9p\6a, Xi;T, ni.,T, Si;k) oc 7i{Xi;t\0a, dp)Ti {dp) , (5.3) 

For the full derivation of the scalar case see. Appendix B. 

In the following section we provide intuition for our choice of univariate Slice sampler 
as compared to alternative Markov chain algorithms. In particular we describe the ad- 
vantages that the Slice sampler has compared to more standard Markov chain samplers, 
though we also point out the additional complexity involved. We verify the validity of 
the Slice sampling algorithm for those not familiar with this specialized algorithm and 
we then describe some intricacies associated with implementation of the algorithm. This 
is followed by a discussion of some extensions we developed when analyzing the OpRisk 
model. The technical details of the actual algorithm are provided in Appendix C. 

5.2 Slice sampling 

The full conditionals given in equations (15.11) . (15.21) . (15. 3p do not take standard explicit 
closed forms and typically the normalizing constants are not known in closed form. There- 
fore this will exclude straightforward inversion or basic rejection sampling being used to 
sample from these distributions. Therefore one may adopt a Metropolis Hastings (MH) 
within Gibbs sampler to obtain samples, see for example Gilks, Richardson and Spiegel- 
halter (1996) and Robert and Casella (2004) for detailed expositions of such approaches. 
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To utilize such algorithms it is important to select a suitable proposal distribution. Quite 
often in high dimensional problems such as ours, this requires tuning of the proposal 
for a given target distribution. Hence, one incurs a significant additional computational 
expense in tuning the proposal distribution parameters off-line so that mixing of the 
resulting Markov chain is sufficient. An alternative not discussed here would include an 
Adaptive Metropolis Hastings within Gibbs sampling algorithm, see Atachade and Rosen- 
thal (2005) and Rosenthal (2009). Here we take a different approach which utilizes the 
full conditional distributions, known as a univariate Slice sampler, see Neal (2003). We 
demonstrate how effective a univariate Slice sampler is for our model. 

Slice sampling was developed with the intention of providing a "black box" approach 
for sampling from a target distribution which may not have a simple form. The Slice 
sampling methodology we develop will be automatically tailored to the desired target 
posterior. As such it does not require pretuning and in many cases will be more efficient 
than a MH within Gibbs sampler. The reason for this, pointed out by Neal (2003), is that 
a MH within Gibbs has two potential problems. The first arises when a MH approach 
attempts moves which are not well adapted to local properties of the density, resulting in 
slow mixing of the Markov chain. Secondly, the small moves arising from the slow mixing 
typically lead to traversal of a region of posterior support in the form of a Random Walk. 
Therefore, steps are required to traverse a distance that could be traversed in only 
L steps if moving consistently in the same direction. A univariate Slice sampler can 
adaptively change the scale of the moves proposed avoiding problems that can arise with 
the MH sampler when the appropriate scale of proposed moves varies over the support of 
the distribution. 

A single iteration of the Slice sampling distribution for a toy example is presented in 
Figure [H The intuition behind Slice sampling arises from the fact that sampling from 
a univariate distribution p (6) can always be achieved by sampling uniformly from the 
region under the distribution p (9) . Obtaining a Slice sample follows two steps: sample 
a value ui ~ U [0,p {6i_i)] and then sample a value uniformly from Ai, 6i ~ f/ [A;] . 
This procedure is repeated and by discarding the auxiliary variable sample ui one obtains 
correlated samples 6'iS fromp (di-i)- Neal (2003), demonstrates that a Markov chain [U, O) 
constructed in this way will have stationary distribution defined by a uniform distribution 
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under p {6) and the marginal of has desired stationary distribution p (6) . Additionally, 
Mira and Tierney (2002) proved that the Slice sampler algorithm, assuming a bounded 
target distribution p (6) with bounded support, is uniformly ergodic. 

Similar to a deterministic scan Gibbs sampler, the simplest way to extend the Slice 
sampler to a multivariate distribution is by considering each full conditional distribution 
in turn. Note, discussion relating to the benefits provided by Random Walk behaviour 
suppression, as achieved by the Slice sampler, are presented in the context of non- reversible 
Markov chains, see Diaconis, Holmes and Neal (2000). 

Additionally, we only need to know the target full conditional posterior up to nor- 
malization, see Neal (2003) p. 710. This is important in this example since solving the 
normalizing constant in this model is not possible analytically. To make more precise 
the intuitive description of the Slice sampler presented above, we briefly detail the ar- 
gument made by Neal on this point. Suppose we wish to sample from a distribution 
for a random vector G M" whose density p (0) is proportional to some function / (0). 
This can be achieved by sampling uniformly from the (n + l)-dimensional region that 
lies under the plot of / (0). This is formalised by introducing the auxiliary random vari- 
able U and defining a joint distribution over and U which is uniform over the region 
{(0, f/) : < u < f (6)} below the surface defined by / (6), given by 



as required. There are many possible procedures to obtain samples of (0,f/). The details 
of the implemented algorithm undertaken in this paper are provided in Appendix C. 

Extensions 

We note that in the Bayesian model we develop, in some cases strong correlation between 
the parameters of the model will be present in the posterior, see Figure |3l In more ex- 
treme cases, this can cause slow rates of convergence of a univariate sampler to reach the 
ergodic regime, translating into longer Markov Chain simulations. In such a situation 




if < M < / (0) , 
otherwise. 



where Z = j f (6) dO . Then the target marginal density for is given by 
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several approaches can be tried to overcome this problem. The first involves the use of 
a mixture transition kernel combining local and global moves. For example, we suggest 
local moves via a univariate Slice sampler and global moves via an Independent Metropo- 
lis Hastings (IMH) sampler with adaptive learning of its covariance structure, such an 
approach is known as a hybrid sampler, see comparisons in Brewer, Aitken and Talbot 
(1996). Alternatively, for the global move if determination of level sets in multiple dimen- 
sions is not problematic, for the model under consideration, then some of the multivariate 
Slice sampler approaches designed to account for correlation between parameters can be 
incorporated, see Neal (2003) for details. This is beyond the scope of this paper. 

Another approach to break correlation between parameters in the posterior is via 
transformation of the parameter space. If the transformation is effective this will reduce 
correlation between parameters of the transformed target posterior. Sampling can then 
proceed in the transformed space, and then samples can be transformed back to the 
original space. It is not always straightforward to find such transformations. 

A third alternative is based on Simulated Tempering, introduced by Marinari and 
Parisi (1992) and discussed extensively in Geyer and Thompson (1995). In particular a 
special version of Simulated Tempering, first introduced by Neal (1996) can be utilised 
in which on considers a sequence of target distributions {tti} constructed such that they 
correspond to the objective posterior in the following way, 

TTi = [tT (0a, Ai:T, Op\ni.,T, Si-.k)]^' 

with sequence {7;} . Then one uses the Slice sampling algorithm presented and replaces 

IT with TTi. 

Running a Markov chain such that at each iteration / we target posterior tti and then 
only keeping samples from the Markov chain corresponding to situations in which 7; = 1 
can result in a significant improvement in exploration around the posterior support. This 
can overcome slow mixing arising from a univariate sampling regime. The intuition for 
this is that for values of 7/ << 1 the target posterior is almost uniform over the space, 
resulting in large moves being possible around the support of the posterior, then as 7^ 
returns to a value of 1, several iterations later, it will be in potentially new unexplored 
regions of posterior support. 
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As an extension we developed a Simulated Tempering Slice sampler to obtain samples 
from the posterior p {6^, Xi:t, Op\ni:T, Si.k)- In our development we utilize a sine function, 
7j = min (sin {^qI) + 1, l), for •ji which has its amplitude truncated to ensure it ranges 
between (0, 1]. That is the function is truncated at 7; = 1 for extended iteration periods 
for our simulation index / to ensure the sampler spends significant time sampling from 
the actual posterior distribution. 

We note that application of the Tempering proved useful and improved mixing of the 
Markov chain. However, for simulation examples presented in the remainder of this paper 
it was sufficient to use the basic univariate Slice sampler presented previously, which is 
more computationally efficient than the Tempered version. 

Note, in the application of Tempering one must discard many simulated states of the 
Markov chain, whenever 7^ 7^ 1. There is however a computational way to avoid discarding 
these samples, see Gramacy, Samworth and King (2007). 

Finally, we note that there are several alternatives to a MH within Gibbs sampler 
such as a basic Gibbs sampler combined with Adaptive Rejection sampling (ARS), Gilks 
and Wild (1992). Note ARS requires distributions to be log concave. Alternatively an 
adaptive version of this known as the Adaptive Metropolis Rejection sampler could be 
used, see Gilks, Best and Tan (1995). 

6 Results 

In this section we demonstrate and compare the performance of our sampling methodology 
on several different copula models. We intend to demonstrate the appropriate behaviour 
of our Bayesian models as a function of the number of annual years, in the presence of 
highly biased expert opinions. This will be achieved through simulation studies using 
the sampling techniques detailed above to perform inference on model parameters. The 
intention will be to demonstrate the appropriate convergence and accuracy as a function 
of data sample size. Hereafter, we study the case of dependence between intensities of 
two risks and set risk cell volumes V^^^ = V^"^^ = 1. 
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6.1 Estimation of model if copula parameter is known 

Here, we study the estimation of model parameters in two cases. The first case involves 
two low frequency risks. In the second case, one risk has low frequency while another risk 
has high frequency. In these two cases we present results for the univariate Slice sampler 
under scenarios involving: data generated independently for each risk profile and data 
generated using a Gaussian, Clayton and Gumbel copulas. 

Only one expert opinion is assumed for each risk. We present the parameter estimates 
as a function of data size for each of the specified correlation levels. That is, we study the 
accuracy of the parameter estimates as the number of observations increases. Simulation 
results are obtained by creating independently 20 data sets each of length 20 years, then 
for each data set simulations are performed for subsets of the data going for 1, 2, 5, 
10, 15 and 20 years. We then average the performance of posterior estimates over these 
independent simulations. The Markov chains are run for 50,000 iterations with 10,000 
iterations discarded as burnin. The simulation time depends on the number of risk profiles, 
the number of observations and expert opinions and the length of the Markov chaiijl]. In 
performing the analysis we studied three cases and in each case we performed the following 
steps, 

1. Simulate a data set of appropriate number of years according to the procedure 
specified in Appendix A. 

2. Obtain correlated MCMC samples from the target posterior distribution after dis- 
carding burnin samples, {^a,«7 ^1:T,i} , i = 1001, . . . , 50000. 

3. Estimate desired posterior quantities such as posterior mean of parameters of inter- 
est and posterior standard deviations. 

4. Repeat stages 1 - 3 for 20 independent data realisations and then average the results. 

• Joint: The results are obtained by MCMC samples taken from tt (^a, Ai;r|ni.r, di-K, 9p) 

typical run with 5 years of data and 1 expert in the bivariate case for 50,000 simulations took 
approximately 50scc and approximately 43min for the case of ten risk profiles when coded in Fortran and 
run on 2.40GHz Intel Core2. 



26 



with the correct copula model and copula parameter used in the sampler. This is 
the procedure that should be performed in a real application. 

• Marginal: Results are obtained by MCMC samples taken from 



which is the posterior in the case of independence. This is equivalent to marginal 
estimation where single risk cell data is analyzed separately, see Section 4.1. 

• Benchmark: To verify the results we also consider the case where we assume 
perfect knowledge of the realized random process for random vector Ai:^. We then 
perform inference on ©a without the additional uncertainty arising from estimating 
Ai-T- In this regard this represents a benchmark for which we may compare the 
performance of our simulations. In particular, it is obtained by samples taken from 
71 {6\\\i-T,ni:T,Si-K,0p) conditional on the true simulated realizations of random 
variables Ai^t- 

Example 1: low frequency risk profiles. We set the true parameter values of O^^'* and 
Oj^^ to be ol-ll^^ = 5 and 6^11^^ = 5 respectively. Also we choose the expert's opinion on the 
true parameters to be an underestimate in risk profile 1 with A^^'' = 2 and an overestimate 
for risk profile 2 with A^^"* = 8. The model parameters were set to ^'^^^ = = 2, 
a^^^ = a*-^-* = 2, and prior distribution parameters a*-^-* = a*-^-* = 2, b^^^ = 6^^^ = 2.5. 
The results for this simulation study, presented in Tables [1] and [21 show the appropriate 
convergence of the estimates of parameters Q^^^ and Q^^^ as a function of the data size, 
demonstrating how well this simulation procedure works under these models. In addition 
we note that as expected from credibility theory we observe that joint estimation is better 
than the marginal, i.e. the posterior standard deviations for 0^^^^ and Oj^"* are less when 
joint estimation is used. In addition the rate of convergence of the posterior mean for Oa 
to the true value is faster under the joint estimation. Note, the standard errors in the 
posterior mean and standard deviation were calculated and found to be strictly in the 
range of 1-5% for the simulations presented. 
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In Figure [3], corresponding to Gaussian, Clayton and Gumbel copula models respec- 
tively, we demonstrate the estimated density ti {6^\Xi,T,ni;T, ^i-.K^Op) if we had perfect 
knowledge of the latent process parameters Ai^t- In this way we compare the exact poste- 
rior with perfect knowledge of the correlation structure as captured by the copula model 
which here we assume is known. Obtaining these plots involved a particular realized data 
set of length 20 years. For all copulas two values of Op were considered: Op = 0.9 and 
Op = 0.1 for Gaussian copula; Op = 10 and Op = 1 for Clayton copula; and Op = 3 and 
Op = 1.1 for the Gumbel copula. These plots of the joint marginal posterior distribution of 
©A demonstrate clearly that the standard practice in the industry of performing marginal 
estimation of risk profiles will lead to incorrect results when estimating quantities based 
on the distribution of ©a- 

Example 2: one low frequency and one high frequency risk profile. We set the 

true values of Q^l^ and 6^^^ to be oHl^^ = 5 and ol'^l^ = 10 respectively. Also we choose 
the expert's opinion on the true parameters to be an under estimate in risk profile 1 with 
A^^^ = 2 and an over estimate for risk profile 2 with A^^'' = 13. The model parameters 
were set to {(^^ = ^(2) = 2, a^^) = 2,a(2) = 2, a^^) = a^^) = 2, 6(1) = 2.5,6(2) = 5. The 
simulation results and comparisons are developed in the same approach as Example 1 and 
again the standard errors in the estimates were in the range 1-5%. The results can be 
found in Tables [3] and HI 

6.2 Joint estimation of marginal and copula parameters 

Here we estimate G^^^Oj^^ and Op jointly. For this example, the model settings from 
Example 1 were used and one data set of length 20 years randomly generated was utilized. 
The simulation was performed by taking 150,000 iterations of the sampler and discarding 
the first 20,000 as burnin. Results for these simulations are contained in Table O 

These results demonstrate that our model and estimation methodology is successfully 
able to estimate jointly the risk profiles and the correlation parameter. This is seen to 
be the case for all the models we consider in this paper. It is also clear that with few 
observations, e.g. T < 5, and a vague prior for the copula parameter, it will be difficult 
to accurately estimate the copula parameter. This is largely due to the fact that the 
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posterior distribution in this case is diffuse. Additionally, with a small amount of data it 
appears that accurately estimating the copula parameter is most difficult in the Gumbel 
model. However, as the number of observations increases the accuracy of the estimate 
improves in all models and the estimates are reasonable in the case of 15 or 20 years 
of data. Additionally, we could further improve the accuracy of this prediction if we 
incorporated expert opinions into the prior specification of the copula parameter, instead 
of using a vague prior. 

Overall, we have demonstrated that combination of all the relevant sources of data 
can be achieved under our model. Then with this study we show that our sampling 
methodology has the ability to estimate jointly all the model parameters including the 
copula parameter. This is a key step forward in model development and estimation 
for OpRisk models. We further envisage that one can extend this methodology to more 
sophisticated and flexible copula based models with more than one parameter. This should 
be relatively trivial since the methodology we developed applies directly. However, the 
challenge in the case of a more sophisticated copula model relates to finding a relevant 
choice of prior distribution on the correlation structure. 

Full predictive distribution. As a final comment in this section we point out an 
important additional outcome of obtaining samples from the joint posterior distribution 
of the model parameters and the correlation. This relates to construction of the full 
predictive annual loss distribution, accounting for parameter uncertainty. 

Typically practitioners will take point estimates of all parameters and then condition 
on these point estimates to empirically construct the predictive distribution and then 
calculate risk measures to be reported such as VaR. Here we comment that a more robust 
approach to prediction can now be performed. Using our methodology, we can construct 
the full predictive distribution after removing the parameter uncertainty from the model 
considered, including the uncertainty arising from the correlation parameter. To achieve 
this we would consider the full predictive distribution: 

TT (Zr+i|ni:T, ^i:t) = J ^ {Zt+i\Oa, Op) IT (Oa, 9p\ni.,T, 5i:t) dO^dOp. (6.1) 
Here, we used the model assumptions that given ©a and 9p we have that Zt+i is in- 
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dependent from the observations (A^i-^, Ai-^^). In practice to obtain samples from this 
full predictive distribution involves taking the steps demonstrated in Appendix A with a 
minor modification. If one wanted to simulate L annual losses from the full predictive dis- 
tribution, this would involve first running the Slice sampler for L iterations after burnin. 
Then for each iteration / one would use the state of the Markov chain (0a,/, 9p^i) in the 
simulation procedure detailed in Appendix A. We also note that it is trivial under our 
methodology to extend this full predictive distribution sampling to the case of frequency 
and severities. 

7 Discussion 

This paper introduced a dynamic OpRisk model which allows for significant flexibility in 
correlation structures introduced between risk profiles. Next a Bayesian framework was 
established to allow inference and estimation under this model to be performed, whilst at 
the same time allowing incorporation of alternative data sources into the inferential proce- 
dure. Then a novel simulation procedure was developed for the Bayesian model presented, 
in the case of dependence between frequency risk profiles. Simulations were performed to 
demonstrate the accuracy of this procedure in multiple bivariate examples. Comparisons 
were made between marginal estimation and a benchmark estimation procedure. In all 
simulations, the estimation of the model parameters was accurate and behaviour of the 
estimates of posterior mean and standard deviation presented, smoothed over multiple 
data realizations, was as expected. Initially the influence of the biased expert opinion 
observation influenced the results and as the size of the data set for actual annual loss 
counts grew, the estimations improved in accuracy. Clearly, the joint estimation will out- 
perform marginal estimation when forming predictions of future counts and rates in year 
T + 1, given estimates based on data up to year T . Additionally, we demonstrated highly 
accurate estimation of the copula parameter, jointly with the model parameters. 

Additionally, simulations were performed in the models J = 5 and J = 10 for the Clay- 
ton copula model in which the copula parameter is also unknown. Though the simulation 
time was increased as a factor of the number of risk cells, the results and performance 
were as presented for the bivariate models, making this approach suitable for practical 
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purposes. 

Finally, the main objective of the paper is to preset the framework for the multivariate 
problem and to demonstrate estimation in this setting. Application of the framework to 
real data is the subject of further research. In this paper, the estimation procedure is 
presented for frequencies only but it can be extended in the same manner as presented to 
severities. 

A Appendix: Simulation of annual losses 

In general, given marginal and copula parameters {6\, Oif,, Op), the simulation of the annual 
losses for year t = T + 1, when risk profiles are dependent, can be done as described below. 

1. Simulate 2J-variate Mi, . . . , uj, Vi, . . . ,vj from a 2 J dimensional copula C{-\Op). 

2. Calculate A^^^'^ = G'^ and ^p'^ = H^^ (t^jl^?) , j = 1, . . . , J 

3. Sample np^ from P ^-lA^^j , j = 1, . . . , J. 

4. Sample iid xi-'\t), s = 1, . . . , n[-'\j = 1, . . . , J from F (^■\ipt''^^ ■ 

5. Calculate annual losses z^''^ = ^ xi''^ (t) ,j = 1, . . . , J. 

6. Repeat Steps 1-5, K times to get K random samples of the annual losses zl''\ 

Note, to simulate from the full predictive distribution of annual losses, add simulation of 
{6\,6q,,6p) from the posterior distribution (e.g. using Slice sampler methodology) as an 
extra step before Step 1. Simulation of the random variates from a copula in Step 1 in 
the case of Gaussian, Clayton and Gumbel copulas can be done as described below. 
Gaussian copula: 

1. Simulate d-variate Xi, . . . ,Xd from (0, S), where $7v (0, S) is a Normal distribu- 
tion with zero means, unit variances and correlation matrix S. 

2. Calculate ui = F/v(xi), . . . ,Ud = Fj\f{xd)- Obtained . . . , Ud) is a d-variate from 
a Gaussian copula. 
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Archimedean copulas: The Clayton and Gumbel copulas are members of the Archimedean 
family of copulas. The dimensional Archimedean copulas can be written as 

C (ui, . . . , Ud\p) = (j)-^ {(f) (ui) + ■■■ + (/> M) (A.l) 

with (f) a decreasing function known as the generator for the given copula, see Frees and 
Valdez (1998). The generator and inverse generator for the Clayton {(pc) and Gumbel 
copulas are given by 

0c (t) = (t^'-i); 0c'(^) = (i + ^)"'; 

0gW = (-Intf; (s) = exp (-s^) , (A.2) 

where p is a copula parameter. Simulation from such a copula can be achieved following 
the algorithm provided in Melchiori (2006): 

1. Sample d independent random variates Vi, . . . ,Vd from a uniform distribution U[0, 1]. 

2. Simulate y from D{-) such that Laplace transform of D satisfies C (D) = and 
D (0) = 0. 

3. Find Si = — (Invi) /y for i = 1, d 

4. Calculate Ui = (sj) ioi i = 1, . . . ,d. 

The obtained {ui, . . . , Ud) is a d-vanate from d-dimensional Archimedean copula. What 
remains is to define the relevant distribution D{-) for the Clayton and Gumbel Copulas. 
For the Clayton copula, D{-) is a Gamma distribution with shape parameter given by 
and unit scale. For the Gumbel copula, D{-) is from the a— stable family Sa {/3,^,6) 
with the following parameters shape a = p~^, skewness (3 = 1, scale 7 = (cos(|7r/p))^ 
and location 5 = 0. In the Gumbel case, the density for D has no analytic form and the 
simulation from this distribution can be achieved using the algorithm from Nolan (2007) 
to efficiently generate the required samples from the univariate stable distribution. 

B Appendix: Full conditional posterior distributions 

Note, in Part 1 and Part 2 we are conditioning on the copula parameter 6p, this notation 
is dropped for simplicity. It is only explicitly introduced in Part 3. 
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Part 1: Using Bayes' theorem 

TT {9l^^\e[-'\ Ai;T, ni.T, S,..k) oc tt (^o[-'\ \,.,t, n,.,T, ^Ji:,^^?) vr (^i^')) 

= TT (Ai.^, m.T, ^i^'^) vr (0i-^>J')) vr (^i^')) . (B.l) 

Using the model structure to exploit conditional independence properties we note that 

= TT (ni.rl Ai.r) TT (Ai^tI^I^^^ e'i^^ TT (^i^A' | (B.2) 
which specifies the full conditional distributions for the j*^' component Q^f' as 

71" (6*1^ Va~^^ '^l:'^, ""-hT, OC 7r(ni;r|Ai:T) 

oc Tr (Ai^^l^i-^), ^i^)) vr ((5,.^|0l-^), ^l^^) vr (ei'^ V?) vr (^?) . (B.3) 

Part 2: The next full conditional distribution we must specify is given by 

vr (aFVa, X^-^'-^\ n,.,T. S^.k) cc vr (o^, X[t^'-^\ n,.,r, 5,.,k\\?) vr (a^^) 

= vr n,.,T, d,..K\eA, Ap')) vr (^aIA^) vr (a^'^) . (B.4) 

We then use conditional independence properties of the model to get 

= vr {m.,T\X^J^-'\ aS^)) vr [x^J'-'^lOj,, X?) vr {S^-.M (B.5) 

giving the full conditional we are interested in, up to proportionality, 

vr (aS^>a, X^J-'\ n,.,T, S.-.k) oc vr {n,.,T\X^-^^-'\ \['^) 

X vr {6,..K\eA) vr (aS:^*'-^>a, aS^'^) vr (^aIA^) vr (aI^'^) . (B.6) 

We now demonstrate that this expression simplifies significantly. We can show that the 
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terms vr (^X[t^'-^^\0a, X'f^^ tt [OaIX?^ it (^A?^) simplify as follows: 



. ( aS:*-^>a, aS^-)) vr fe.lAF') vr f A^) = ^ ^ vr (^.|A1^)) vr (aS^) 



7r(AS:^*'-^\AS^>A)vr(0A) 

^ /-^ vr ( e'AlA ■'M vr ( ApM 

- ^e^ix^^) vr (a1^-)^ ^ ^ ^ ^ 



vr 



Ai:^'-^\ aP>a) vr (0a) . (B.7) 
Finally, we are left with the full conditional distribution 



X vr (5i.k|0a) vr (A;r^'"^>A, aJ^'^) vr (^aIAS^'^) vr ( 
OC vr {n,..T\X^-^'-'\ X?) vr [x^^^'~'\ Ap^. 



(i) 



OC vr {n,..T\X^^^'"'\ X?) vr (aS"^), aS^>a) • (B.8) 

Part 3: The full conditional distribution for the copula parameter is given by 

vr {9p\0a, Ai;r, ni,T, Si.k) oc vr (6^, Xi.t, tii.t, Si.xlOp) vr {9p) 
OC vr Ai:t) vr {Si-.kIOa) vr(Ai:T|0A, 6'p)vr (^a) vr {9p) 
cxvr(Ai:T|6>A,^p)vr(^p). (B.9) 

C Appendix: Slice sampler algorithm. 

Here, we provide the explicit details involved into implementation of a Slice sampler 
algorithm within a Gibbs sampler framework discussed in Section 5. The iterations of the 
Slice sampler are denoted by simulation index / G N. 

Slice sampling: 

1. Initialize I = the parameter vector [6^,0, Xi-ta, Opfl] randomly or deterministically. 

2. Repeat while / < L 
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(a) Set [^A,«, ^p,i] = [Ok,i~i-,^i:t,i-i-,0p,i-i\ 

(b) Sample j uniformly from set {1, 2, . . . , J} 

Sample new parameter value O^f^ from the full conditional posterior distribution 

TT {df'\0\^j\ \i:T,h ni,T, ^l-.K, dp,l^ ■ 

Set e^j = of. 

(c) Sample j uniformly from set {1, 2, . . . , J} and t uniformly from set {1, . . . , T} 
Sample new parameter value Ap^ from the full conditional posterior distribution 

TT ^Ap^|0A,i5 ^\:T,l "^l.T-, ^1:K, dp,l^ ■ 

Set \^ = \^\ 

(d) Sample new parameter value 6p from the full conditional posterior distribution 

TT {9p\0A,l, \l:T,l, ni.,T, Si;k)- 

Set 6p^i = 9 p. 
3. 1 = 1 + 1 and return to 2. 



The sampling from the full conditional posteriors in stage 2 uses a univariate Slice sampler, 
see Figure[2l We present the case where we wish to sample the next iteration of the Markov 
chain from vr (6'[^Va7^ '^i^t.'' '^^■■T^ ^i--K, 



Obtaining a sample using a univariate Slice sampler: 

1. Sample u from a uniform distribution 



U 



0' ^ [^aIWa,P ^ ^i-T,i, ni;T, Si;K, 6/^ 



2. Sample uniformly from the intervals (level set) 



A = i9\- : vr 



-)'j(^\0[-p,X,.,T,i,n,..T,di..K,ep) >u\. 



There are many approaches that could be used in determination of the level sets A of our 
density 



( ^a ''|^A /''' -^1:^,^ ''^l.T, Sl;K, 
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see Neal (2003) [p. 712, Section 4]. For simplicity in our proceeding examples we assume 
that we can restrict our parameter space to the finite ranges and we argue that this is 
reasonable since we can consider the finite bounds for example set according to machine 
precision for the smallest and largest number we can represent on our computing platform. 
This is not strictly required, but simplifies the coding of the algorithm. We then perform 
what Neal (2003) terms a stepping out and a shrinkage procedure, the details of which are 
contained in Neal (2003) [p. 713, Figure 1]. The basic idea is that given a sampled vertical 
level u then the level sets A can be found by positioning an interval of width w randomly 
around 6'^]. This interval is expanded in step sizes of width w until both ends are outside 
the slice. Then a new state is obtained by sampling uniformly from the interval until a 
point in the slice A is obtained. Points that fail can be used to shrink the interval. 
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Year 


1 2 5 10 15 20 


Independent 


Marginal 


3.72 (2.04) 4.10 (1.98) 4.08 (1.62) 4.64 (1.42) 5.13 (1.31) 5.24 (1.27) 


Gaussian copula with (p — 0.9) 


Benchmark 
Joint 
Marginal 


4.32 (1.88) 4.50 (1.67) 4.84 (1.46) 5.17 (1.26) 5.19 (1.12) 5.21 (1.02) 
3.91 (2.01) 4.41 (1.72) 4.37 (1.56) 4.76 (1.33) 5.10 (1.21) 4.95 (1.05) 
3.72 (2.05) 4.09 (1.97) 4.06 (1.61) 4.48 (1.37) 5.07 (1.29) 5.04 (1.13) 


Clayton copula with (p = 10) 


Benchmark 
Joint 

Marginal 


4.81 (1.82) 5.17 (1.72) 5.13 (1.42) 4.96 (1.13) 5.10 (0.98) 5.00 (0.84) 
4.19 (2.03) 4.92 (1.87) 5.05 (1.56) 4.87 (1.26) 4.96 (1.08) 4.90 (0.93) 
3.91 (2.12) 4.43 (2.10) 4.54 (1.74) 4.47 (1.36) 4.75 (1.22) 4.72 (1.08) 


Gumbel copula with (p = 3) 


Benchmark 
Joint 
Marginal 


4.32 (1.98) 4.46 (1.70) 4.86 (1.41) 5.08 (1.16) 5.16 (1.01) 5.11 (0.88) 

4.33 (2.06) 4.21 (1.80) 4.54 (1.56) 4.96 (1.23) 5.01 (1.05) 4.98 (0.93) 
3.84 (2.08) 3.76 (1.87) 4.17 (1.62) 4.63 (1.41) 4.74 (1.22) 4.72 (1.07) 



Table 1: Average estimates of posterior mean and standard deviation of for 20 data sets. Data are 
generated using different copula models as specified. The true values are ^^J^^Ie ~ ^trie — 5. 
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Year 


1 2 5 10 15 20 


Independent 


Marginal 


6.74 (2.74) 6.84 (2.59) 6.46 (2.16) 5.91 (1.67) 5.74 (1.40) 5.47 (1.31) 


Gaussian (p — 0.9) 


Benchmark 
Joint 
Marginal 


5.98 (2.29) 5.84 (2.04) 5.46 (1.60) 5.47 (1.31) 5.43 (1.14) 5.41 (1.04) 
6.37 (2.55) 6.01 (2.23) 5.63 (1.80) 5.40 (1.43) 5.43 (1.25) 5.36 (1.12) 
6.59 (2.72) 6.49 (2.54) 6.01 (2.07) 5.75 (1.64) 5.62 (1.43) 5.57 (1.26) 


Clayton {p = 10) 


Benchmark 
Joint 

Marginal 


5.57 (1.91) 5.41 (1.69) 5.20 (1.40) 4.90 (1.10) 5.09 (0.96) 5.07 (0.85) 
6.39 (2.48) 5.92 (1.92) 5.36 (1.64) 5.06 (1.22) 5.13 (1.17) 5.00 (1.02) 
6.69 (2.74) 6.56 (2.55) 5.92 (2.04) 5.40 (1.56) 5.37 (1.36) 5.24 (1.17) 


Gumbel (p = 3) 


Benchmark 
Joint 
Marginal 


5.83 (2.35) 5.51 (2.02) 5.38 (1.57) 5.15 (1.18) 5.20 (1.02) 5.12 (0.89) 
6.05 (2.47) 5.96 (2.17) 5.47 (1.76) 5.21 (1.27) 5.12 (1.07) 5.12 (0.94) 
6.42 (2.67) 6.26 (2.50) 5.92 (2.04) 5.67 (1.62) 5.52 (1.37) 5.36 (1.18) 



Table 2: Average estimates of posterior mean and standard deviation of Oj^ for 20 data sets. The data 
are generated using different copula models as specified. The true values are o[l\^, ~ ^true ~ 5. 
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Year 


1 2 5 10 15 20 


Independent 


Marginal 


3.72 (2.04) 4.07 (1.97) 4.05 (1.61) 4.48 (1.37) 4.94 (1.26) 5.13 (1.13) 


Gaussian (p — 0.9) 


Benchmark 
Joint 
Marginal 


4.07 (1.76) 4.22 (1.53) 4.61 (1.36) 5.10 (1.20) 5.10 (1.08) 5.20 (0.99) 
3.86 (1.96) 4.39 (1.86) 4.46 (1.51) 4.84 (1.33) 5.10 (1.22) 5.24 (1.10) 
3.72 (2.04) 4.08 (1.97) 4.05 (1.61) 4.48 (1.37) 4.94 (1.26) 5.13 (1.13) 


Clayton {p = 10) 


Benchmark 
Joint 

Marginal 


4.45 (1.65) 4.86 (1.55) 4.89 (1.32) 4.82 (1.07) 5.00 (0.95) 4.92 (0.83) 
4.15 (2.01) 4.84 (1.95) 4.92 (1.59) 4.69 (1.33) 4.97 (1.20) 4.96 (1.04) 
3.98 (2.10) 4.53 (2.09) 4.54 (1.74) 4.47 (1.36) 4.74 (1.21) 4.72 (1.08) 


Gumbel (p = 3) 


Benchmark 
Joint 
Marginal 


4.14 (1.90) 4.20 (1.58) 4.65 (1.32) 4.95 (1.11) 5.06 (0.97) 5.04 (0.87) 
4.36 (2.16) 4.17 (1.85) 4.68 (1.57) 5.10 (1.34) 5.21 (1.21) 5.24 (1.01) 
3.84 (2.17) 3.75 (1.87) 4.17 (1.62) 4.64 (1.41) 4.75 (1.22) 4.79 (1.09) 



Table 3: Average estimates of posterior mean and standard deviation of for 20 data sets. Data are 
generated using different copula models as specified. The true values are ^^J^^^e = 5 and o[fl^^ — 10. 
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Year 


1 2 5 10 15 20 


Independent 


Marginal 


10.89 (3.74) 10.78 (3.60) 10.18 (3.19) 9.70 (2.67) 9.64 (2.31) 9.48 (2.14) 


Gaussian (p = 0.9) 


Benchmark 
Joint 
Marginal 


10.44 (3.48) 10.50 (3.24) 10.25 (2.81) 10.51 (2.39) 10.78 (2.05) 10.04 (1.88) 
10.68 (3.63) 10.13 (3.35) 9.57 (2.87) 9.60 (2.36) 9.31 (2.03) 9.23 (1.82) 
10.89 (3.74) 10.78 (3.60) 10.18 (3.19) 9.70 (2.67) 9.64 (2.31) 9.48 (2.15) 


Clayton {p 10) 


Benchmark 
Joint 
Marginal 


10.04 (3.21) 10.07 (2.99) 9.88 (2.58) 9.59 (2.08) 9.97 (1.87) 9.97 (1.66) 
10.58 (3.54) 10.03 (3.19) 9.29 (2.69) 9.82 (2.22) 9.93 (1.97) 9.78 (1.73) 
10.94 (3.75) 10.92 (3.61) 10.13 (3.17) 9.32 (2.60) 9.30 (2.25) 9.45 (1.96) 


Gumbel (p = 3) 


Benchmark 
Joint 
Marginal 


10.10 (3.55) 9.88 (3.23) 10.08 (2.74) 9.98 (2.22) 10.17 (1.95) 10.06 (1.74) 
10.10 (3.61) 10.18 (3.39) 9.44 (2.87) 9.98 (2.33) 9.85 (1.99) 9.63 (1.75) 
10.61 (3.76) 10.51 (3.59) 10.11 (3.17) 9.70 (2.66) 9.45 (2.27) 9.39 (1.98) 



Table 4: Average estimates of posterior mean and standard deviation of QfJ for 20 data sets. Data are 
generated using different copula models as specified. The true values arc 9\^\^ = 5 and fi'j^^g — 10. 
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Year 


1 2 5 10 15 20 


Posterior mean and standard deviation for G^^"* 


Independent 


2.81 (1.75) 4.71 (2.21) 3.05 (1.29) 4.90 (1.48) 4.38 (1.14) 5.13 (1.15) 


Gaussian [p = 0.9) 


2.83 (1.74) 4.49 (2.02) 3.31 (1.38) 4.88 (1.29) 4.36 (1.10) 5.07 (1.09) 


Clayton (p = 10) 


4.27 (2.04) 3.80 (1.76) 4.14 (1.46) 5.62 (1.45) 4.53 (1.04) 4.90 (0.99) 


Gumbel (p = 3) 


2.81 (1.40) 2.59 (1.26) 3.08 (1.17) 4.28 (1.26) 4.51 (1.12) 4.91 (0.95) 


Posterior mean and standard deviation for O^^"* 


Independent 


10.24 (3.97) 9.56 (3.56) 9.48 (3.17) 9.10 (2.58) 9.33 (2.24) 9.55 (1.98) 


Gaussian [p = 0.9) 


10.23 (3.92) 10.85 (3.52) 8.72 (2.95) 8.91 (2.12) 8.58 (2.04) 9.94 (1.85) 


Clayton (p = 10) 


11.40 (3.58) 10.76 (3.47) 10.85 (3.10) 11.39 (2.60) 10.76 (2.27) 10.17 (2.03) 


Gumbel (p = 3) 


12.27 (3.63) 11.26 (3.59) 9.11 (2.93) 9.54 (2.26) 9.91 (1.72) 9.88 (1.17) 


Posterior mean and standard deviation for 0^ 


Independent 


0.20 (0.53) 0.10 (0.44) -0.10 (0.38) -0.02 (0.30) -0.17 (0.28) -0.12 (0.16) 


Gaussian (p = 0.9) 


0.21 (0.54) 0.47 (0.39) 0.61 (0.30) 0.66 (0.24) 0.70 (0.19) 0.74 (0.15) 


Clayton (p = 10) 


5.37 (2.81) 5.83 (2.66) 6.20 (2.52) 6.48 (2.25) 6.70 (2.11) 8.24 (1.88) 


Gumbel (p — 3) 


16.41 (8.33) 16.59 (8.19) 16.39 (8.09) 9.42 (8.76) 5.12 (6.21) 3.90 (4.80) 



Table 5: Posterior estimates for 05^ ,6^ ^'^'^ copula parameter Qp. In this case a single data set is 
generated using different copula models as specified. Posterior standard deviations are given in brackets 
next to estimate. Joint estimation was used. 
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Figure 1: Spearman's rank correlation, psR Z^^^^, between the annual losses vs copula parameter 

p, also see Section 3. (o) pgj^ {Z^^\ Z'^^) vs copula parameter p between frequency risk profiles a|^' and 
Ap^; (*) psR {Z^^\ Z'^)) vs copula parameter p between severity risk profiles ^'j^' and ^'P'. 
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Figure 2: Markov chain created for Oa and auxiliary random variable U, 

(wi, 9a,i) , • ■ • , (ui-i, 6'a_;_i) , (u;, 6'A,i) , • . • has stationary distribution with the desired marginal distribu- 
tion p (0\) . 
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Figure 3: Scatter plot of (e^^',ej^^^) from 7r(6»A| "1:20, f^i:!, Ai:2o, ^p) witli Gaussian, Clayton and 
Gumbel copulas C{-\9p = p) between frequency risk profiles. Top row: strong correlation. Bottom row: 
weak correlation 
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